As with normal ciphers, there is a trivial brute-force attack which can find a collision in any hash function
Usually, we are not particularly worried about this attack because it takes
To illustrate the attack we are going to answer the following question: given
We assume that each birthday date is equally likely and that we are only working with the
the probability of a collision is the negation of the probability that there is no collision amongst the
Imagine the people entering the room one by one (or equivalently, the messages being generated independently one after the other). The probability that there is no collision in the birthdays of the
This is true because if there were no collisions in the first
The 1 at the beginning represents the probability that the first person's birthday does not collide with someone's else when entering the room, which is 100%, since there are no other people in the room until the first one enters. This probability can be rewritten as the following product:
Therefore, the probability that a collision does occur can be written as
We are now going to use a well-known inequality (we are going to take it for granted because proving it is out of scope), namely that
What is nice about exponential functions with the same base is that when multiplying them, the exponents simply add, yielding
The function
Recall that the left-hand side is smaller than the probability of a collision. Therefore,
While we did not obtain an exact equation for the value of
Given $q$ elements which are uniformly and independently chosen from a set of $B$ possible elements, the probability that two elements are the same is at least $1 - e^{-\frac{q^2}{B}}$.
Now let's put the theorem to work. How many people do we need in the room in order for there to be 50% chance that two of them share a birthday? Well, plug in
Solving this equation yields
If we have a hash function
The naive birthday attack does precisely this. First, it chooses
This variation is called naive because it has a huge space complexity, namely
Since the birthday attack is universal and works for any hash function, it is used instead of the simple brute force attack as the gold standard when creating security proofs.
There is an improved version of the birthday attack which has approximately the same probability success and running time but only takes a constant amount of memory. This attack uses Floyd's cycle finding algorithm.
Begin by choosing a random initial message
Store the index
fn SmallSpaceBirthdayAttack()
{
let x_0 = random_binary_string();
let x = x_0;
let x' = x_0;
let i = 0;
while(true)
{
x = H(x);
x' = H(H(x'));
if (x = x')
{
break;
}
else
{
++i;
}
}
let x = x_0;
let x' = x_0;
for(let j = 0; j < i; ++j)
{
if (H(x) = H(x'))
{
return (x, x');
}
else
{
x = H(x);
x' = H(x');
}
}
}
This attack uses much less memory than the naive method because it only needs to store the initial value